Abstract - Recently Lin et al. proposed a method of using
the underdetermined BSS (blind source separation) problem to
realize image and speech encryption. In this paper, we give a
cryptanalysis of this BSS-based encryption and point out that it
is not secure against known/chosen-plaintext attack and
chosen-ciphertext attack. In addition, there exist some other security
defects: low sensitivity to part of the key and to the plaintext,
a ciphertext-only differential attack, and a divide-and-conquer (DAC)
attack on part of the key. We also discuss the role of BSS in Lin
et al.'s efforts towards cryptographically secure ciphers.

Index Terms — blind source separation (BSS), speech encryp- 
tion, image encryption, cryptanalysis, known-plaintext attack, 
chosen-plaintext attack, chosen-ciphertext attack, differential 
attack, divide-and-conquer (DAC) attack. 



I. Introduction

With the rapid development of multimedia and networking 
technologies, the security of multimedia data becomes more 
and more important in many real applications. To fulfill such 
an increasing demand, during past decades many encryption 
schemes have been proposed to protect multimedia data, 
including speech, images and videos [1]-[9].

According to the nature of protected data, multimedia 
encryption schemes can be classified into two basic types: 
analog and digital. Most early schemes were designed to 
encrypt analog data in various ways: element permuting, signal 
masking, frequency shuffling, etc., all of which may be exerted 
in time domain or transform domain or both. However, due 
to the simplicity of the encryption procedures, almost all 
analog encryption schemes are not sufficiently secure against 
cryptographical attacks, especially those modern attacks such 
as known/chosen-plaintext and chosen-ciphertext attacks [2], 
[3], [10], [11]. As a comparison, in digital encryption schemes, 
one can employ any cryptographically strong cipher, such 
as DES [12] or AES [13], to achieve a higher level of 
security. Besides, to achieve a higher efficiency of encryption 
and some special demands of multimedia encryption (such 
as format-compliance [14] and perceptual encryption [15]), 
many specific multimedia encryption schemes have also been
developed [4]-[6]. Recent cryptanalysis work [16]-[30] has
shown that some multimedia encryption schemes are insecure 
against various cryptographical attacks. 

Recently Lin et al. suggested employing blind source sepa- 
ration (BSS) for the purpose of image and speech encryption 
[31]-[37]. The basic idea is to mix multiple plaintexts (or mul- 
tiple segments of the same plaintext) with a number of secret 
key signals, in the hope that an attacker has to solve a hard 
mathematical problem - the underdetermined BSS problem. In 
Sec. VII of [37], Lin et al. claimed that this BSS-based cipher 
"is immune from the attacks such as the ciphertext-only attack, 
the known-plaintext, and the chosen-plaintext attack", "as long 
as the intractability of the underdetermined BSS problem is 
guaranteed by the mixing matrix for encryption". 

This paper re-evaluates the security of the BSS-based en- 
cryption scheme and points out that it is actually insecure 
against known/chosen-plaintext attack and chosen-ciphertext 
attack. In addition, some other security defects are also found 
under the ciphertext-only attacking scenario, including the low 
sensitivity to the mixing matrix (part of the secret key) and 
the plaintext, and a differential attack that works well when 
the matrix size is small. Based on the cryptanalytic findings, 
we also discuss the role of BSS in Lin et al.'s efforts towards 
cryptographically secure ciphers. 

The rest of this paper is organized as follows. In the next section
we give a brief introduction to the BSS-based encryption
scheme. Section III is the main body of this paper and focuses
on the cryptanalysis of the BSS-based encryption scheme.
Then, the role of BSS in cryptography is discussed in Sec. IV.
Finally, the last section concludes this paper.

II. BSS-Based Encryption

Blind source separation is a technique that tries to recover
a set of unobserved sources or signals from observed mixtures
[38]. Given $N$ unobserved signals $s_1, \ldots, s_N$ and a mixing
matrix $\mathbf{A}$ of size $M \times N$, the BSS problem is to recover
$s_1, \ldots, s_N$ from $M$ observed signals $x_1, \ldots, x_M$, where

$$[x_1, \ldots, x_M]^T = \mathbf{A}\,[s_1, \ldots, s_N]^T. \qquad (1)$$

When $M \ge N$, blind source separation is possible when
$\mathbf{A}$ satisfies some requirements. However, when $M < N$, this
is generally impossible (whatever $\mathbf{A}$ is), thus leading to the
underdetermined BSS problem.

In [31]-[37], Lin et al. introduced a number of secret key
signals so that determining the plaintext signals becomes an
underdetermined BSS problem when the key signals are unknown.
Given $P$ input plain-signals $s_1(t), \ldots, s_P(t)$ and $Q$ key
signals $k_1(t), \ldots, k_Q(t)$, the encryption procedure is
described as follows^1:

$$\mathbf{x}(t) = [x_1(t), \ldots, x_P(t)]^T = \mathbf{A}\,\mathbf{s}_k(t), \qquad (2)$$

where $\mathbf{x}(t)$ denotes the $P$ cipher-signals,
$\mathbf{s}_k(t) = [s_1(t), \ldots, s_P(t), k_1(t), \ldots, k_Q(t)]^T$,
and $\mathbf{A}$ is a $P \times (P+Q)$ mixing matrix whose elements lie
within $[-1, 1]$. Assume that $\mathbf{A} = [\mathbf{A}_s, \mathbf{A}_k]$,
where $\mathbf{A}_s$ is a $P \times P$ matrix and $\mathbf{A}_k$ is a
$P \times Q$ matrix. Then, the encryption procedure can be
represented in an equivalent form:

$$\mathbf{x}(t) = \mathbf{A}_s\mathbf{s}(t) + \mathbf{A}_k\mathbf{k}(t), \qquad (3)$$

where $\mathbf{s}(t) = [s_1(t), \ldots, s_P(t)]^T$ and
$\mathbf{k}(t) = [k_1(t), \ldots, k_Q(t)]^T$. Thus, as long as
$\mathbf{A}_s$ is an invertible matrix, one can decrypt
$\mathbf{s}(t)$ as follows^2:

$$\mathbf{s}(t) = \mathbf{A}_s^{-1}\left(\mathbf{x}(t) - \mathbf{A}_k\mathbf{k}(t)\right). \qquad (4)$$

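Different values of $Q$ were used in Lin et al.'s papers: $Q = 1$
in [31] and $Q = P$ in [32]-[37]. When $Q = P$, Lin et al.
further set $\mathbf{A}_s = \mathbf{B}$ and $\mathbf{A}_k = \beta\mathbf{B}$,
where $\beta \ge 10$ for image encryption and $\beta \ge 1$ for speech
encryption. In this case, the encryption procedure becomes

$$\mathbf{x}(t) = \mathbf{B}\left(\mathbf{s}(t) + \beta\mathbf{k}(t)\right), \qquad (5)$$

and the decryption procedure becomes

$$\mathbf{s}(t) = \mathbf{B}^{-1}\mathbf{x}(t) - \beta\mathbf{k}(t). \qquad (6)$$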
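
As a minimal sketch of Eqs. (5) and (6), the following Python snippet encrypts and decrypts a toy plaintext with $P = Q = 2$. The matrix `B`, the value of `beta`, the seed, and the helper names (`key_signals`, `encrypt`, `decrypt`) are illustrative assumptions, and Python's `random` module merely stands in for the unspecified PRNG; none of these are taken from Lin et al.'s papers.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inv2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def key_signals(seed, length, Q=2):
    """Hypothetical PRNG: one key vector k(t) per time index, seeded by I_0."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(Q)] for _ in range(length)]

def encrypt(B, beta, seed, s):
    """Eq. (5): x(t) = B (s(t) + beta * k(t))."""
    k = key_signals(seed, len(s))
    return [matvec(B, [si + beta * ki for si, ki in zip(st, kt)])
            for st, kt in zip(s, k)]

def decrypt(B, beta, seed, x):
    """Eq. (6): s(t) = B^{-1} x(t) - beta * k(t)."""
    k = key_signals(seed, len(x))
    B_inv = inv2(B)
    return [[yi - beta * ki for yi, ki in zip(matvec(B_inv, xt), kt)]
            for xt, kt in zip(x, k)]

B = [[0.8, -0.3], [0.2, 0.5]]              # assumed invertible example matrix
s = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.3]]   # s(t) = [s_1(t), s_2(t)]
x = encrypt(B, 2.0, seed=1234, s=s)
s_rec = decrypt(B, 2.0, seed=1234, x=x)
max_err = max(abs(a - b) for u, v in zip(s, s_rec) for a, b in zip(u, v))
print(max_err)   # tiny: only floating-point round-off remains
```

With the correct key the round trip is exact up to round-off, which is the baseline against which the mismatched-key experiments of Sec. III are measured.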

Observing Eq. (3), one can see that the encryption procedure
contains two steps:

• Step 1: $\mathbf{x}^{(1)}(t) = \mathbf{A}_s\mathbf{s}(t)$;

• Step 2: $\mathbf{x}(t) = \mathbf{x}^{(1)}(t) + \mathbf{A}_k\mathbf{k}(t)$.

The first step corresponds to a substitution (block) cipher, and
the second step corresponds to an additive stream cipher. From
another point of view, the two steps can be exchanged as follows:

• Step 1: $\mathbf{x}^{(1)}(t) = \mathbf{s}(t) + \mathbf{A}_s^{-1}\mathbf{A}_k\mathbf{k}(t)$;

• Step 2: $\mathbf{x}(t) = \mathbf{A}_s\mathbf{x}^{(1)}(t)$.

In any case, the BSS-based encryption scheme is always a
product cipher composed of a simple block cipher and an
additive stream cipher. In the next section, we will show that the
two sub-ciphers can be separately broken by known/chosen-plaintext
attack and chosen-ciphertext attack.

In the BSS-based encryption scheme, the key signals
$k_1(t), \ldots, k_Q(t)$ are as long as the plain-signals and have to
be generated by a pseudo-random number generator (PRNG)
with a secret seed $I_0$, which serves as the secret key. In Lin
et al.'s papers, it was not explicitly mentioned whether or not
the mixing matrix should be used as part of the secret key.
However, if the attacker knows $\mathbf{A}$, the product cipher degrades
to a stream cipher. Considering
$\mathbf{x}^*(t) = \mathbf{A}_s^{-1}\mathbf{x}(t)$ as the
equivalent cipher-signal, the encryption procedure becomes

$$\mathbf{x}^*(t) = \mathbf{s}(t) + \mathbf{A}_s^{-1}\mathbf{A}_k\mathbf{k}(t). \qquad (7)$$

^1 To achieve a clearer description of the BSS-based encryption scheme, in
this paper we use some notations different from those in Lin et al.'s original
papers. For example, in [37], the $i$-th key signal is denoted by $s_{ni}(t)$, while
in this paper we use $k_i(t)$ to emphasize the fact that it is a key signal.

^2 In Lin et al.'s papers, it is said that the decryption procedure was
achieved via BSS. However, from the cryptographical point of view, it is
more convenient to denote the decryption procedure by Eq. (4).



In this case, the encryption scheme is actually independent
of the underdetermined BSS problem. In addition, as we
show later in Sec. III-A.5, the key signals can be totally
circumvented in a ciphertext-only differential attack, so the
mixing matrix $\mathbf{A}$ must be kept secret as part of the key. Thus,
in this paper we assume that the secret key consists of both $I_0$ and
$\mathbf{A}$.

In [31]-[35], the BSS-based encryption scheme was mainly
designed to encrypt $P$ images simultaneously, where $s_i(t)$ is
the $t$-th pixel in the $i$-th image. In [36], [37], the encryption
scheme was suggested to encrypt a single speech signal, each frame
of which is divided into $P$ segments and $s_i(t)$ is the $t$-th
sample in the $i$-th segment. This encryption scheme can also
be applied to a single image, by dividing it into $P$ blocks
of the same size. To facilitate the following discussion, we
assume that the encryption scheme is used to encrypt a single
plaintext with $P$ segments of equal size.

In Sec. VII of [37], Lin et al. claimed that the BSS-based
encryption scheme is secure against most modern
cryptographical attacks, including the ciphertext-only attack, the
known-plaintext attack, and the chosen-plaintext attack. In
the next section we will show that this claim is problematic.

III. Cryptanalysis 

Before introducing the cryptanalytic results, let us see how
large the key space is. In Lin et al.'s papers, each element of $\mathbf{A}$
is within the interval $[-1, 1]$. Then, assuming that each element
in $\mathbf{A}$ has $R$ possible values^3, the number of all possible mixing
matrices $\mathbf{A}$ is $R^{P(P+Q)}$. Furthermore, assuming that the bit size
of $I_0$ is $L$, the size of the whole key space is $R^{P(P+Q)}2^L$.
When $Q = P$ and $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$, the size of
the whole key space is $R^{P^2}2^L$. Later we will show that the real size of the
key space is much smaller than this estimation, due to some
essential security defects of the BSS-based encryption scheme.
We will also point out that the encryption scheme under
study is not secure against known/chosen-plaintext attack and
chosen-ciphertext attack.

A. Ciphertext-Only Attack 

1) Divide-and-Conquer (DAC) Attack: Eq. (4) can be rewritten in
the following form:

$$\mathbf{s}(t) = \tilde{\mathbf{A}}\,\mathbf{x}_k(t), \qquad (8)$$

where $\mathbf{x}_k(t) = [x_1(t), \ldots, x_P(t), k_1(t), \ldots, k_Q(t)]^T$ and

$$\tilde{\mathbf{A}} = \mathbf{A}_s^{-1}[\mathbf{I}, -\mathbf{A}_k] = [\mathbf{A}_s^{-1}, -\mathbf{A}_s^{-1}\mathbf{A}_k].$$

From the above equation, to recover $s_i(t)$, one only needs to
know $\mathbf{k}(t)$ and the $i$-th row of $\tilde{\mathbf{A}}$. In other
words, when the BSS-based encryption scheme is used to encrypt $P$
independent plaintexts, the $i$-th plaintext can be exactly recovered with
the knowledge of $I_0$ and the $i$-th row of $\tilde{\mathbf{A}}$. A similar result
can be obtained when $P$ segments of one single plaintext are
encrypted with the encryption scheme. This fact means that the
$P$ rows of $\tilde{\mathbf{A}}$ can be separately broken with a
divide-and-conquer (DAC) attack. As a result, the size of the key space is
reduced to $PR^{P+Q}2^L$. When $Q = P$ and
$\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$, it becomes $PR^{P}2^L$.

^3 The value of $R$ is determined by the finite precision under which the
cryptosystem is realized. For example, if the cryptosystem is implemented
with $n$-bit fixed-point arithmetic, $R = 2^n$; if it is implemented with IEEE
floating-point arithmetic, $R \approx 2^{31}$ (single-precision) or
$R \approx 2^{63}$ (double-precision) [39], where note that the exponent of a
floating-point number within $[-1, 1]$ is never positive.
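
To give a feeling for the size of this reduction, the sketch below expresses the nominal key space $R^{P(P+Q)}2^L$ and the DAC-reduced key space $PR^{P+Q}2^L$ in bits. The parameter values (16-bit fixed-point entries, a 64-bit seed, $P = Q = 4$) and the function name `keyspace_bits` are assumptions chosen only for illustration.

```python
import math

def keyspace_bits(P, Q, R_bits, L, dac=False):
    """log2 of the key-space size.

    Nominal:            R^{P(P+Q)} * 2^L
    Divide-and-conquer: P * R^{P+Q} * 2^L  (one row at a time)
    """
    if dac:
        return math.log2(P) + (P + Q) * R_bits + L
    return P * (P + Q) * R_bits + L

P, Q = 4, 4
R_bits = 16   # assumed: 16-bit fixed-point entries, i.e. R = 2^16
L = 64        # assumed seed length in bits
nominal = keyspace_bits(P, Q, R_bits, L)
reduced = keyspace_bits(P, Q, R_bits, L, dac=True)
print(nominal, reduced)   # 576 bits nominally, 194 bits after the DAC reduction
```

Under these assumed parameters the exponent contributed by the mixing matrix drops by roughly a factor of $P$.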

2) Low Sensitivity to A: From the cryptographical point of
view, given two distinct keys, even if their difference is the
minimal value under the current finite precision, the encryption
and decryption results of a good cryptosystem should still
be completely different. In other words, the cryptosystem
should have a very high sensitivity to the secret key [12].
Unfortunately, the BSS-based encryption scheme does not
satisfy this security principle, because the involved matrix
computation is not sufficiently sensitive to matrix mismatch.
Given two matrices $\mathbf{A}_1$ and $\mathbf{A}_2$ of size $M \times N$,
if the maximal difference of all elements is $\varepsilon$, then one can
easily deduce that each element of
$|\mathbf{A}_1\mathbf{s}(t) - \mathbf{A}_2\mathbf{s}(t)|$ is not greater than
$N \max(\mathbf{s}(t))\,\varepsilon$. As a result, the matrix $\mathbf{A}$ can be
approximately guessed under a relatively large finite precision $\varepsilon$, still
maintaining an acceptable quality of the recovered plaintexts.
This immediately leads to a significant reduction of the size of
the key space: from $PR^{P+Q}2^L$ to
$P\lceil 2/\varepsilon \rceil^{P+Q}2^L$, where $\lceil 2/\varepsilon \rceil$ is the
number of sub-intervals of width $\varepsilon$ needed to cover $[-1, 1]$.

The above low sensitivity can be easily verified with the
experiments described as follows:

• Step 1: for a randomly-generated key $(\mathbf{A}, I_0)$, calculate
the ciphertext $\mathbf{x}(t)$ corresponding to a plaintext $\mathbf{s}(t)$;

• Step 2: with another mismatched key $(\mathbf{A} + \varepsilon\mathbf{R}, I_0)$,
decrypt $\mathbf{x}(t)$ to get $\tilde{\mathbf{s}}(t)$ - an estimated version of
$\mathbf{s}(t)$, where $\varepsilon \in (0, 1)$ and $\mathbf{R}$ is a
$P \times (P+Q)$ random $(1, -1)$-matrix.

For each value of $\varepsilon$, the second step was repeated 100
times to get a mean value of the recovery error (measured
in MAE - mean absolute error)^4. Then, we can observe the
relationship between the recovery error and the value of $\varepsilon$.
Figure 1 shows the experimental results when the plaintexts
are a digital image and a speech file, respectively.
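
The two steps above can be sketched in a few lines of Python for $P = Q = 2$ and $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$. This is only a toy reproduction under stated assumptions: the plaintext is uniform random data rather than an image or speech file, the matrix `A_s` and the function name `mean_recovery_error` are hypothetical, and 20 trials are used instead of 100.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inv2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mean_recovery_error(eps, trials=20, T=200, beta=2.0, seed=0):
    """Average MAE when decrypting with A + eps*R instead of A (P = Q = 2)."""
    rng = random.Random(seed)                       # same draws for every eps
    A_s = [[0.8, -0.3], [0.2, 0.5]]                 # assumed example B
    A_k = [[beta * b for b in row] for row in A_s]  # A = [B, beta*B]
    s = [[rng.random(), rng.random()] for _ in range(T)]              # toy plaintext
    k = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(T)]  # key signal
    x = [[u + v for u, v in zip(matvec(A_s, st), matvec(A_k, kt))]
         for st, kt in zip(s, k)]                   # Step 1: Eq. (3)
    total = 0.0
    for _ in range(trials):
        # Step 2: every entry of A is shifted by +eps or -eps.
        As_g = [[a + eps * rng.choice((1, -1)) for a in row] for row in A_s]
        Ak_g = [[a + eps * rng.choice((1, -1)) for a in row] for row in A_k]
        inv = inv2(As_g)
        err = 0.0
        for st, kt, xt in zip(s, k, x):
            est = matvec(inv, [xi - mi for xi, mi in zip(xt, matvec(Ak_g, kt))])
            err += sum(abs(e - v) for e, v in zip(est, st))
        total += err / (2 * T)
    return total / trials

for eps in (0.0, 0.001, 0.01, 0.1):
    print(eps, mean_recovery_error(eps))
```

In this toy setting the printed error grows gradually with `eps` rather than jumping to a plaintext-independent level, which is the behavior the experiments are designed to expose.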

The experimental results confirm that a mismatched key
can approximately recover the plaintext. Considering that
humans have a good capability of tolerating errors in images
and speech, even relatively large errors may not be able to
prevent a human attacker from recognizing the plain-image or
plain-speech. Thus, the value of $\varepsilon$ may be relatively large.
When $P = 4$, $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$ and $\varepsilon = 0.1$,
we give two examples of such recognizable plaintexts with relatively large
errors in Figs. 2 and 3.

From the above experimental results, we can exhaustively
search for an approximate version of $\mathbf{A}$ under the finite
precision $\varepsilon = 0.01 \sim 0.1$. Such an approximate version of
$\mathbf{A}$ is then used to roughly reveal the plaintext. Considering
that the searching complexity is $O(\varepsilon^{-(P+Q)})$, such an exhaustive
search is feasible when $P$ and $Q$ are not very large^5. When $P = 2$
and $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$, we carried out a large number of
experiments in the following steps:

^4 When the plaintext is a digital image with 256 gray scales, we first calibrate
each sub-image into the range $\{0, \ldots, 255\}$ and then calculate the recovery
error of the whole image.

^5 In [31]-[37], small values are used in all examples: $P = 2$ or $4$ and
$Q \le P$.

Fig. 1. The experimental relationship between the recovery error and the
value of $\varepsilon$: a) the plaintext is a digital image "Lenna" (Fig. 3a); b) the plaintext
is a speech file "one.wav" that corresponds to the pronunciation of the English
word "one" (from Merriam-Webster Online Dictionary, http://www.m-w.com).
Legend: * - $P = Q = 4$; o - $P = 4$ and $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$
($\beta = 10$ in panel a, $\beta = 2$ in panel b).

• Step 1: for a randomly-generated key $(\mathbf{B}, I_0)$, calculate
the ciphertext $\mathbf{x}(t)$ corresponding to a plaintext $\mathbf{s}(t)$;

• Step 2: randomly generate a matrix $\mathbf{R}$ (each element
over the interval $[-1, 1]$), and then decrypt $\mathbf{x}(t)$ with the
guessed key $(\mathbf{R}, I_0)$ to get $\tilde{\mathbf{s}}(t)$;

• Step 3: repeat Step 2 for $r$ rounds, and output the recovered
plaintext $\mathbf{s}^*(t)$, every segment of which corresponds to
the best recovery performance in all the $r$ rounds;

• Step 4: for the $i$-th segment of $\mathbf{s}^*(t)$, find the corresponding
matrix $\mathbf{R}$, and extract the $i$-th row of its inverse $\mathbf{R}^{-1}$ to
form the $i$-th row of $\tilde{\mathbf{B}}^{-1}$, the inverse of an estimation
of the original matrix $\mathbf{B}$.

Fig. 2. An example of human capability against large noises in
speech. From top to bottom: the original plain-speech "one.wav",
the recovered speech, the recovery error (MAE=0.164103).
For readers' sake, the recovered speech is posted online at
http://www.hooklee.com/Papers/Data/BSSE/one_MAE=0.164103.wav.

Fig. 3. An example of human capability against large noises in images: a)
the original plain-image "Lenna"; b) the recovered image (MAE=47.6913).

Assuming that the target finite precision is $\varepsilon > 0$, the interval
$[-1, 1]$ is divided into $n_\varepsilon = \lceil 2/\varepsilon \rceil$ sub-intervals. Without
loss of generality, assuming that $2/\varepsilon$ is an integer, each
sub-interval is of equal size. Thus, if the element in the
random matrix $\mathbf{R}$ has a uniform distribution over $[-1, 1]$, the
probability that $|r_{i,j} - a_{i,j}| < \varepsilon$ occurs at least one time in $r$
rounds of experiment is $p(n_\varepsilon, r) = 1 - (1 - 1/n_\varepsilon)^r$, where $r_{i,j}$
and $a_{i,j}$ are the $(i,j)$-th elements of $\mathbf{R}$ and $\mathbf{A}$, respectively.
One can easily deduce that $p(n_\varepsilon, r)$ is an increasing function
with respect to $r$ and

$$p(n_\varepsilon, n_\varepsilon) > \lim_{n_\varepsilon \to \infty} p(n_\varepsilon, n_\varepsilon) = 1 - \lim_{n_\varepsilon \to \infty} (1 - 1/n_\varepsilon)^{n_\varepsilon} = 1 - e^{-1} \approx 0.6321,$$

which leads to the result that $p(n_\varepsilon, r) > 1 - e^{-1}$ when $r \ge n_\varepsilon$.
In other words, with $r \ge n_\varepsilon$ experiments, it is a
high-probability event that we have at least one $r_{i,j}$ "equal" to $a_{i,j}$
under the finite precision $\varepsilon$. To get an approximate estimation
of the $i$-th row of $\mathbf{A}$, we can see that $r = O(n_\varepsilon^P)$ rounds of
experiment are needed.



Apparently, the above steps actually simulate the process of
a real ciphertext-only attack that tries to reveal the plaintext
and to exhaustively guess $\mathbf{B}$ (under the assumption that
$I_0$ is known). Note that MAE cannot be calculated to
evaluate the recovery performance in a real attack, in which
one does not know the plaintext. Fortunately, exploiting the
large information redundancy existing in natural images and
speech, one can turn to some other measures to reflect
the recovery performance of each segment of $\tilde{\mathbf{s}}(t)$. In our
experiments, we use a measure called MANE (mean absolute
neighboring error), which is defined as follows for the $i$-th
segment of $\tilde{\mathbf{s}}(t)$:

$$\mathrm{MANE}(\tilde{s}_i) = \frac{1}{T-2} \sum_{t=2}^{T-1} \frac{|\tilde{s}_i(t) - \tilde{s}_i(t-1)| + |\tilde{s}_i(t) - \tilde{s}_i(t+1)|}{2}, \qquad (9)$$

where $T$ denotes the segment length. In Figs. 4 and 5, one
recovered plain-speech and two recovered plain-images are
shown for demonstration. One can see that $r = O(10{,}000)$
(or $\varepsilon \approx 0.01$) is sufficient to get a good estimation of the
plaintext.
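
A minimal sketch of such a neighbor-based measure is given below. It assumes the normalization of averaging over the $T-2$ interior samples and halving the sum of the two neighbor differences; the sine wave and the uniform noise are merely stand-ins for a well-recovered segment and a badly recovered one.

```python
import math
import random

def mane(seg):
    """Mean absolute neighboring error of one segment.

    Averages, over interior samples, the mean absolute difference
    between a sample and its two neighbors (assumed normalization).
    """
    T = len(seg)
    total = sum((abs(seg[t] - seg[t - 1]) + abs(seg[t] - seg[t + 1])) / 2
                for t in range(1, T - 1))
    return total / (T - 2)

rng = random.Random(0)
smooth = [math.sin(2 * math.pi * t / 100) for t in range(400)]  # redundant, "natural" signal
noisy = [v + rng.uniform(-1, 1) for v in smooth]                # stand-in for a wrong-key output
print(mane(smooth), mane(noisy))
```

Because the measure needs no reference plaintext, an attacker can use it to rank candidate keys: a smaller value suggests a smoother, more plausible segment.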




Fig. 4. A recovered speech in one 50,000-round experiment of
exhaustively guessing $\mathbf{A}$ when $P = 2$ and $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$.
From top to bottom: the original plain-speech "one.wav", the
recovered speech (MANE of each segment: 0.0469, 0.0521), the
recovery error. For readers' sake, the recovered speech is posted online at
http://www.hooklee.com/Papers/Data/BSSE/one_MANE=0.0469-0.0521.wav.

Note that for 2-D images the above 1-D MANE may be 
generalized to include more neighboring pixels, thus achieving 
a more accurate description of the recovery performance. 
In addition, multiple quality factors can be employed to 
further increase the efficiency of evaluation of the recovery 
performance. 

3) Low Sensitivity to k(t): For the same reason as
the low sensitivity to $\mathbf{A}$, one can deduce that the BSS-based
encryption scheme is also insensitive to the key signal $\mathbf{k}(t)$.
Given two key signals $\mathbf{k}_1(t)$ and $\mathbf{k}_2(t)$, if the maximal
difference of all elements is $\varepsilon$, each element of
$|\mathbf{A}_k\mathbf{k}_1(t) - \mathbf{A}_k\mathbf{k}_2(t)|$
is not greater than $Q \max(|\mathbf{A}_k|)\,\varepsilon \le Q\varepsilon$. Since
$\mathbf{k}(t)$ itself is not part of the secret key, but generated from $I_0$,
this problem does not have much negative influence on the security of the
whole cryptosystem against ciphertext-only attacks.

Fig. 5. Two recovered plain-images in our experiments of exhaustively
guessing $\mathbf{B}$ when $P = 2$ and $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$:
a) $r = 1{,}000$ (MANE of each segment: 39.7491, 14.9373); b) $r = 10{,}000$
(MANE of each segment: 16.3888, 15.1722).

4) Low Sensitivity to Plaintext: Another cryptographical
property required of a good cryptosystem is that the encryption
is very sensitive to the plaintext, i.e., the ciphertexts of two
plaintexts with a slight difference should be much different
[12]. However, this property does not hold for the BSS-based
encryption scheme. Given two plain-signals $\mathbf{s}_1(t)$ and
$\mathbf{s}_2(t)$, if the maximal difference of all elements is $\varepsilon$,
each element of $|\mathbf{A}_s\mathbf{s}_1(t) - \mathbf{A}_s\mathbf{s}_2(t)|$ is
not greater than $P \max(|\mathbf{A}_s|)\,\varepsilon \le P\varepsilon$.
When the same secret key is used to encrypt two closely
correlated plaintexts, such as a plaintext and its watermarked
version, this security defect means that the exposure of one
plaintext leads to the disclosure of both.
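
The bound can be illustrated with a small numerical check for $P = 2$. The matrix `A_s` and the two plaintext samples are assumed values; because both plaintexts are encrypted with the same key signal, the masking term $\mathbf{A}_k\mathbf{k}(t)$ is identical in the two ciphertexts and does not affect their difference.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A_s = [[0.8, -0.3], [0.2, 0.5]]   # assumed example, entries within [-1, 1]
P = len(A_s)
eps = 0.01
s1 = [0.40, 0.75]                      # one plaintext sample s(t)
s2 = [s1[0] + eps, s1[1] - eps]        # e.g. a lightly watermarked copy

# Ciphertext difference: the common term A_k k(t) cancels, leaving A_s (s1 - s2).
diff = [abs(a - b) for a, b in zip(matvec(A_s, s1), matvec(A_s, s2))]
bound = P * max(abs(a) for row in A_s for a in row) * eps
print(diff, bound)
```

The ciphertext difference stays within the bound $P \max(|\mathbf{A}_s|)\,\varepsilon$, so two nearly identical plaintexts produce nearly identical ciphertexts.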

5) Differential Attack: Given two plaintexts $\mathbf{s}^{(1)}(t)$ and
$\mathbf{s}^{(2)}(t)$, if they are encrypted with the same key $(\mathbf{A}, I_0)$, we
can get the following formula from Eq. (3):

$$\Delta\mathbf{x}(t) = \mathbf{A}_s\,\Delta\mathbf{s}(t), \qquad (10)$$

where $\Delta\mathbf{x}(t) = \mathbf{x}^{(1)}(t) - \mathbf{x}^{(2)}(t)$ and
$\Delta\mathbf{s}(t) = \mathbf{s}^{(1)}(t) - \mathbf{s}^{(2)}(t)$.
Note that $\mathbf{A}_k\mathbf{k}(t)$ disappears in the above equation. This
means that from the differential viewpoint only $\mathbf{A}_s$ is the
secret key, i.e., $I_0$ is removed from the key. Considering the
low sensitivity of the encryption scheme to $\mathbf{A}$, under finite
precision $\varepsilon$ the key space becomes $O(P\varepsilon^{-P})$, and one might
exhaustively search $\mathbf{A}_s$ to recover the plaintext differential as
follows:

$$\Delta\mathbf{s}(t) = \mathbf{A}_s^{-1}\Delta\mathbf{x}(t). \qquad (11)$$

From the obtained plaintext differential, one can get a mixed
view of the two plaintexts of interest, from which both
plaintexts may be completely recognizable by humans. See Figs. 6
and 7 for four plaintext differentials of two speech files and
two images.
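
The cancellation of the key signal in Eq. (10) can be verified directly. In the sketch below (with $P = Q = 2$), the matrices `A_s` and `A_k`, the random plaintexts and the helper names are assumptions made for illustration; the point is that Eq. (11) recovers the plaintext differential without touching $\mathbf{k}(t)$.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inv2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

rng = random.Random(7)
A_s = [[0.8, -0.3], [0.2, 0.5]]   # assumed example matrices
A_k = [[0.6, 0.1], [-0.4, 0.9]]
T = 5
k = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(T)]  # same key for both

def encrypt(s):
    """Eq. (3): x(t) = A_s s(t) + A_k k(t)."""
    return [[a + b for a, b in zip(matvec(A_s, st), matvec(A_k, kt))]
            for st, kt in zip(s, k)]

s1 = [[rng.random(), rng.random()] for _ in range(T)]
s2 = [[rng.random(), rng.random()] for _ in range(T)]
x1, x2 = encrypt(s1), encrypt(s2)

dx = [[a - b for a, b in zip(u, v)] for u, v in zip(x1, x2)]
ds = [[a - b for a, b in zip(u, v)] for u, v in zip(s1, s2)]
A_s_inv = inv2(A_s)
ds_rec = [matvec(A_s_inv, d) for d in dx]   # Eq. (11): no key signal involved
max_err = max(abs(a - b) for u, v in zip(ds, ds_rec) for a, b in zip(u, v))
print(max_err)   # round-off only
```

Here the true $\mathbf{A}_s$ is used for brevity; in the attack itself it would be replaced by a guessed matrix, as discussed next.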

Fig. 6. Differentials of two plain-speech files. From top to bottom: the first
speech "one.wav", the second speech "two.wav", the differential one-two, the
differential two-one. For readers' sake, the two differential speech files are
posted online at http://www.hooklee.com/Papers/Data/BSSE/one-two.wav and
http://www.hooklee.com/Papers/Data/BSSE/two-one.wav.

Fig. 7. Differentials of two plain-images: a) Lenna-cameraman;
b) cameraman-Lenna.

Denoting the guessed matrix by $\tilde{\mathbf{A}}_s$, we have

$$\Delta\tilde{\mathbf{s}}(t) = \tilde{\mathbf{A}}_s^{-1}\Delta\mathbf{x}(t) = \tilde{\mathbf{A}}_s^{-1}\mathbf{A}_s\,\Delta\mathbf{s}(t). \qquad (12)$$

Apparently, if $\tilde{\mathbf{A}}_s \ne \mathbf{A}_s$, the obtained plaintext differential
$\Delta\tilde{\mathbf{s}}(t)$ will have an inter-segment mixture, which may make the
recognition of the two plaintexts more difficult. Fortunately,
when $P$ is relatively small, such an inter-segment mixture
may not be severe enough to prevent the recognition of the
two plaintexts by humans. More importantly, our experiments
showed that humans can recognize the two
plaintexts even when the mismatch between $\tilde{\mathbf{A}}_s$ and $\mathbf{A}_s$ is
not very small. When $P = 2$,

$$\mathbf{A}_s = \begin{bmatrix} 0.7123 & -0.4272 \\ 0.1958 & 0.1295 \end{bmatrix}, \quad \tilde{\mathbf{A}}_s = \begin{bmatrix} 0.5914 & 0.9527 \\ 0.5726 & 0.1437 \end{bmatrix}, \qquad (13)$$

a plaintext differential obtained in our experiments is shown
in Fig. 8. One can see that both plain-images, "Lenna" and
"cameraman", can still be roughly recognized from such
a heavily mixed differential. Another obtained plain-speech
differential for "one.wav" and "two.wav" is shown in Fig. 9,
from which the two English words ("one" and "two") are also
perceptible.

In this differential attack, the quality evaluation factors (such
as MANE) used in Sec. III-A.2 are not suitable for automatically
determining the best result among many plaintext differentials,
because each segment of the obtained plaintext differential is
also a natural signal with abundant information redundancy.
Instead, one has to output all obtained differentials, and check
them with naked eyes or ears to find a perceptually-optimal
result with the least inter-segment mixture. Figure 10 shows
such a result in 100 plain-image differentials when $P = 2$ and
$\mathbf{A}_s$ follows Eq. (13). By checking each segment separately and
combining the $P$ optimal segments together, one can further get
a better result with less inter-segment mixture.

Fig. 8. One obtained plain-image differential when $\tilde{\mathbf{A}}_s$ and $\mathbf{A}_s$ have a
relatively large mismatch as shown in Eq. (13).

Fig. 9. One obtained plain-speech differential when $\tilde{\mathbf{A}}_s$ and $\mathbf{A}_s$ have
a relatively large mismatch. For readers' sake, this differential speech is
posted online at http://www.hooklee.com/Papers/Data/BSSE/two-one-large-mismatch.wav.

Fig. 10. A visually-optimal result obtained in 100 plain-image differentials:
a) the differential; b) the negative image of the differential.

While this differential attack works well for $P = 2$ as shown
above, it will become infeasible when $P$ is sufficiently large,
due to the following facts: 1) the inter-segment mixture is too
severe; 2) the complexity of checking all $O(\varepsilon^{-P})$ differentials
is beyond humans' capability.



B. Known-Plaintext Attack

In this kind of attack, one has access to a number of
plaintexts that are encrypted with the same key. Then, from
Eq. (10), with $P$ plaintext differentials, one immediately
knows that the mixing matrix can be uniquely determined as
follows:

$$\mathbf{A}_s = \Delta\mathbf{x}(t)\left(\Delta\mathbf{s}(t)\right)^{-1}, \qquad (14)$$

where $\Delta\mathbf{s}(t)$ and $\Delta\mathbf{x}(t)$ are $P \times P$ matrices, constructed
column by column from the $P$ plaintext differentials and the
corresponding ciphertext differentials, respectively, so that
Eq. (10) holds for each pair of columns. Then, $\mathbf{A}_k\mathbf{k}(t)$ can be
further solved from any plaintext and its ciphertext:

$$\mathbf{A}_k\mathbf{k}(t) = \mathbf{x}(t) - \mathbf{A}_s\mathbf{s}(t). \qquad (15)$$

Now, $(\mathbf{A}_s, \mathbf{A}_k\mathbf{k}(t))$ can be used to recover other plaintexts
encrypted by the same key $(\mathbf{A}, I_0)$. Note that $\mathbf{A}_k\mathbf{k}(t)$ has a
finite length determined by the maximal length of all known
plaintexts, so $(\mathbf{A}_s, \mathbf{A}_k\mathbf{k}(t))$ can only recover plaintexts under
this finite length.
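
The attack of Eqs. (14) and (15) can be sketched as follows for $P = Q = 2$. Several choices here are assumptions made only for this illustration: the secret matrices, the three known plaintexts, and the way the $P$ differentials are formed (each against a common reference plaintext, evaluated at a single time index `t0` and stacked as columns). The attacker-side code uses only plaintext/ciphertext pairs.

```python
import random

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(M, N):
    return [[sum(M[i][j] * N[j][c] for j in range(2)) for c in range(2)]
            for i in range(2)]

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def columns(vectors):
    """Stack 2-vectors as the columns of a 2x2 matrix."""
    return [[v[i] for v in vectors] for i in range(2)]

# --- secret side (unknown to the attacker) ---
A_s = [[0.8, -0.3], [0.2, 0.5]]
A_k = [[0.6, 0.1], [-0.4, 0.9]]
rng = random.Random(3)
k = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(3)]

def encrypt(s):
    return [[a + b for a, b in zip(matvec(A_s, st), matvec(A_k, kt))]
            for st, kt in zip(s, k)]

# --- attacker's knowledge: three plaintext/ciphertext pairs ---
s0 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
s1 = [[0.9, 0.2], [0.1, 0.8], [0.4, 0.4]]
s2 = [[0.1, 0.7], [0.6, 0.2], [0.3, 0.9]]
x0, x1, x2 = encrypt(s0), encrypt(s1), encrypt(s2)

t0 = 0
dS = columns([sub(s1[t0], s0[t0]), sub(s2[t0], s0[t0])])
dX = columns([sub(x1[t0], x0[t0]), sub(x2[t0], x0[t0])])
A_s_rec = matmul(dX, inv2(dS))                                   # Eq. (14)
mask = [sub(xt, matvec(A_s_rec, st)) for xt, st in zip(x0, s0)]  # Eq. (15)

# Decrypt a fresh ciphertext without knowing A_k or the PRNG seed.
s_new = [[0.25, 0.75], [0.5, 0.5], [0.65, 0.15]]
x_new = encrypt(s_new)
A_s_inv = inv2(A_s_rec)
s_new_rec = [matvec(A_s_inv, sub(xt, mt)) for xt, mt in zip(x_new, mask)]

err_A = max(abs(a - b) for r, q in zip(A_s, A_s_rec) for a, b in zip(r, q))
err_s = max(abs(a - b) for u, v in zip(s_new, s_new_rec) for a, b in zip(u, v))
print(err_A, err_s)   # both at round-off level
```

The recovered pair (`A_s_rec`, `mask`) plays the role of an equivalent key: it decrypts any further ciphertext of at most the same length, even though neither $\mathbf{A}_k$ nor $I_0$ has been found.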

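When $\mathbf{A} = [\mathbf{B}, \beta\mathbf{B}]$, the key signals can also be
determined (by rearranging Eq. (6)):

$$\mathbf{k}(t) = \frac{\mathbf{B}^{-1}\mathbf{x}(t) - \mathbf{s}(t)}{\beta}. \qquad (16)$$

If the PRNG used is not cryptographically strong (such as
LFSR [12]), it may be possible to further derive the secret
seed $I_0$, thus completely breaking the BSS-based encryption
scheme.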

Note that $n$ distinct plaintexts can generate $\binom{n}{2} = n(n-1)/2$
plaintext differentials. Solving the inequality
$n(n-1)/2 \ge P$, one can get the number of required plaintexts to
yield at least $P$ plaintext differentials:

$$n \ge \left\lceil \sqrt{2P + 1/4} + 1/2 \right\rceil \approx \sqrt{2P}. \qquad (17)$$
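
The counting step can be checked by brute force. The sketch below compares the smallest $n$ with $n(n-1)/2 \ge P$, found by direct search, against the closed form $\lceil \sqrt{2P + 1/4} + 1/2 \rceil$, which is what solving the quadratic inequality gives; the function names are illustrative.

```python
import math

def min_plaintexts(P):
    """Smallest n with n(n-1)/2 >= P, found by direct search."""
    n = 1
    while n * (n - 1) // 2 < P:
        n += 1
    return n

def closed_form(P):
    """Closed-form solution of n(n-1)/2 >= P: ceil(sqrt(2P + 1/4) + 1/2)."""
    return math.ceil(math.sqrt(2 * P + 0.25) + 0.5)

for P in (1, 2, 4, 8, 16):
    print(P, min_plaintexts(P), closed_form(P))
```

For example, $P = 4$ gives $n = 4$, since four plaintexts yield six pairwise differentials while three yield only three.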



C. Chosen-Plaintext/Ciphertext Attack 

In chosen-plaintext attack, one can freely choose a number 
of plaintexts and observe the corresponding ciphertexts, while 
in chosen-ciphertext attack, one can freely choose a number 
of ciphertexts and observe the corresponding plaintexts. So in 
these attacks, one can choose P plaintext differentials easily, 
which means that the above differential known-plaintext attack 
still works in the same way. 

IV. Discussion 

As we pointed out in the last section, the BSS-based encryption
scheme is always insecure against known/chosen-plaintext attacks. So the
secret key cannot be reused in any case. This means
that the encryption scheme has to work like a common stream
cipher, by changing the secret key for each distinct plaintext.
However, in this case, $\mathbf{k}(t)$ (equivalently, the secret seed $I_0$) is
enough to provide a high level of security, since $\mathbf{k}(t)$ satisfies
the cryptographical properties of a perfectly secure one-time-pad
cipher (see Sec. V.B of [37]). Then, the mixing matrix
$\mathbf{A}$ becomes superfluous.

Even when one wants to add a second defense to potential 
attacks by applying the BSS mixing, the low sensitivity of 
encryption/decryption to the mixing matrix A (recall Sec. III- 
A.2) makes this goal less useful. As a result, with the current 
encryption design, the BSS model does not play a key role 
in the security of the scheme. The real core of the encryption 
scheme is the embedded PRNG that is in charge of generating 
the key signals masking the plaintexts. 

If one wants to use the BSS-based encryption scheme with
a repeatedly used key, some essential modifications have to
be made to reinforce the security against various attacks.
Following the cryptanalytic results given in the last section, we
suggest adopting two countermeasures simultaneously: 1) use a
sufficiently large $P$; 2) like the design of most modern block
ciphers [12], iterate the BSS-based encryption for many rounds
to avoid the original scheme's low sensitivity to the secret
key and plaintext. It is obvious that both countermeasures
will significantly influence the encryption/decryption speed of
the encryption scheme. It seems doubtful whether such an enhanced
encryption scheme will have any advantages compared with
other multiple-round block ciphers, especially AES [13], which
can be optimized to run at a very high rate on PCs [40].

Finally, it deserves mentioning that the original BSS-based
encryption scheme can be used to realize lossy decryption,
an interesting feature that may be useful in some real
applications^6. This feature means that an encryption scheme
can still (maybe roughly) recover the plaintext even when there
are some errors in the ciphertexts. A typical use of this feature
is that the ciphertext can be compressed with some lossy
algorithms to save the required storage in local computers
or the channel bandwidth for transmission. For the BSS-based
encryption scheme, the lossy decryption feature is ensured
by the low sensitivity of decryption to ciphertext, which has
the same cause as the low sensitivity of encryption to
plaintext (recall Sec. III-A.4). However, keep in mind that the
lossy decryption feature is induced by the low sensitivity to
plaintext/ciphertext, so there is a tradeoff between this feature
and security.

V. Conclusion 

This paper analyzes the security of an image/speech encryp- 
tion scheme based on BSS mixing technology [31]-[37]. It has 
been shown that this BSS-based encryption scheme suffers 
from some security defects, including its vulnerability to a 
ciphertext-only differential attack, known/chosen-plaintext at- 
tack and chosen-ciphertext attack. It remains an open problem 
how to apply BSS technology to construct cryptographically 
strong ciphers.